update nccl-tests.yaml paths #878
base: main
amanshanbhag
left a comment
This is good. Please also make the same changes to
@bluecrayon52 can we close out on comments and merge?
d242b10 to 8aea72f
Apologies for the delay in getting back to this. Thankfully, it looks like @erezzarum had some helpful insights into how we can clean things up.
How are new builds of the
I was able to confirm that the following defaults were set in
However, using the current build of
Likewise, I also had to keep
The example of MPIJob that @erezzarum provided in #881 is lean, and I'm inclined to follow the same pattern once a new build of the
I would make the same changes to nccl-tests-gb200.yaml, but given that it uses a separate ARM image (
It seems the latest version of the nccl-tests container image does not yet reflect my changes. From the Dockerfile. This is an example of what I tested with
Issue #, if available:
Description of changes:
Updated OFI NCCL plugin path to /opt/amazon/ofi-nccl/lib/x86_64-linux-gnu
Updated CUDA toolkit path to /usr/local/cuda/lib64
Updated OFI NCCL tuner plugin path to /opt/amazon/ofi-nccl/lib/x86_64-linux-gnu/libnccl-ofi-tuner.so
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
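For readers without the diff open, here is a rough sketch of how the three updated paths might appear in an MPIJob launcher command. Only the path values come from the description above. The `command` layout, the use of `mpirun -x`, and the choice of `LD_LIBRARY_PATH` and `NCCL_TUNER_PLUGIN` as the carrying variables are assumptions for illustration and may not match the actual contents of nccl-tests.yaml.

```yaml
# Hypothetical launcher fragment; not copied from nccl-tests.yaml.
# Only the three path values below are taken from this PR's description.
command:
  - mpirun
  # Assumption: library search path combines the OFI NCCL plugin dir and the CUDA toolkit dir.
  - -x
  - LD_LIBRARY_PATH=/opt/amazon/ofi-nccl/lib/x86_64-linux-gnu:/usr/local/cuda/lib64
  # Assumption: the tuner plugin is selected via NCCL's NCCL_TUNER_PLUGIN variable.
  - -x
  - NCCL_TUNER_PLUGIN=/opt/amazon/ofi-nccl/lib/x86_64-linux-gnu/libnccl-ofi-tuner.so
```

If the container image already sets these defaults, as discussed in the conversation above, some or all of these overrides may be unnecessary.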